Overview


Dataset statistics

Number of variables: 21
Number of observations: 1309
Missing cells: 3721
Missing cells (%): 13.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 214.9 KiB
Average record size in memory: 168.1 B
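
The two derived figures in this table can be checked from the raw counts. The following is a minimal sketch, assuming that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Values copied from the "Dataset statistics" table.
n_variables = 21
n_observations = 1309
missing_cells = 3721
total_size_kib = 214.9

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both results round to the values reported above (13.5% and 168.1 B).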

Variable types

Numeric: 7
Categorical: 7
Text: 7

Alerts

Age is highly overall correlated with Age_wiki (High correlation)
Age_wiki is highly overall correlated with Age (High correlation)
Boarded is highly overall correlated with Embarked (High correlation)
Class is highly overall correlated with Lifeboat and 2 other fields (High correlation)
Embarked is highly overall correlated with Boarded (High correlation)
Fare is highly overall correlated with WikiId (High correlation)
Lifeboat is highly overall correlated with Class and 1 other field (High correlation)
Pclass is highly overall correlated with Class and 2 other fields (High correlation)
Sex is highly overall correlated with Survived (High correlation)
Survived is highly overall correlated with Sex (High correlation)
WikiId is highly overall correlated with Class and 2 other fields (High correlation)
Survived has 418 (31.9%) missing values (Missing)
Age has 263 (20.1%) missing values (Missing)
Cabin has 1014 (77.5%) missing values (Missing)
Lifeboat has 807 (61.7%) missing values (Missing)
Body has 1179 (90.1%) missing values (Missing)
PassengerId is uniformly distributed (Uniform)
WikiId is uniformly distributed (Uniform)
PassengerId has unique values (Unique)
SibSp has 891 (68.1%) zeros (Zeros)
Parch has 1002 (76.5%) zeros (Zeros)
Fare has 17 (1.3%) zeros (Zeros)

Reproduction

Analysis started: 2024-12-31 07:14:03.805659
Analysis finished: 2024-12-31 07:14:19.772007
Duration: 15.97 seconds
Software version: ydata-profiling v4.12.1
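
The reported duration is the difference between the two timestamps. A small sketch using only the standard library, with the timestamps copied from the lines above:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-12-31 07:14:03.805659")
finished = datetime.fromisoformat("2024-12-31 07:14:19.772007")

duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```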

Variables

PassengerId
Real number (ℝ)

Uniform  Unique 

Distinct: 1309
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 655
Minimum: 1
Maximum: 1309
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 66.4
Q1: 328
median: 655
Q3: 982
95-th percentile: 1243.6
Maximum: 1309
Range: 1308
Interquartile range (IQR): 654

Descriptive statistics

Standard deviation: 378.02006
Coefficient of variation (CV): 0.57712986
Kurtosis: -1.2
Mean: 655
Median Absolute Deviation (MAD): 327
Skewness: 0
Sum: 857395
Variance: 142899.17
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
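
The PassengerId statistics are what one would expect from a column that holds each integer from 1 to 1309 exactly once. The sketch below reproduces them with Python's `statistics` module. It assumes that the values are exactly 1..1309 (consistent with the Unique, Uniform, and Strictly increasing flags, the minimum, and the maximum) and that the report uses the sample variance (divisor n − 1).

```python
import statistics

# Assumption: PassengerId takes each integer 1..1309 exactly once.
ids = range(1, 1310)

total = sum(ids)
mean = statistics.mean(ids)
variance = statistics.variance(ids)   # sample variance, divisor n - 1
std = statistics.stdev(ids)
cv = std / mean                       # coefficient of variation

print(total, mean, round(variance, 2), round(std, 5), round(cv, 8))
```

Under these assumptions the sum, mean, variance, standard deviation, and CV agree with the reported values.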
Common values

Value                Count  Frequency (%)
1                    1      0.1%
861                  1      0.1%
879                  1      0.1%
878                  1      0.1%
877                  1      0.1%
876                  1      0.1%
875                  1      0.1%
874                  1      0.1%
873                  1      0.1%
872                  1      0.1%
Other values (1299)  1299   99.2%

Minimum 10 values

Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%
10     1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
1309   1      0.1%
1308   1      0.1%
1307   1      0.1%
1306   1      0.1%
1305   1      0.1%
1304   1      0.1%
1303   1      0.1%
1302   1      0.1%
1301   1      0.1%
1300   1      0.1%

Survived
Categorical

High correlation  Missing 

Distinct: 2
Distinct (%): 0.2%
Missing: 418
Missing (%): 31.9%
Memory size: 10.4 KiB

Value  Count
0.0    549
1.0    342

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 2673
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 0.0

Common Values

Value      Count  Frequency (%)
0.0        549    41.9%
1.0        342    26.1%
(Missing)  418    31.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0.0    549    61.6%
1.0    342    38.4%
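
The Survived percentages differ between the Common Values table and the plot table because they use different denominators. A short sketch of the arithmetic, assuming the first table divides by all 1309 rows and the second by the 891 non-missing values (the variable names are illustrative):

```python
# Counts copied from the Survived tables.
counts = {"0.0": 549, "1.0": 342}
missing = 418

n_present = sum(counts.values())   # non-missing values
n_rows = n_present + missing       # all observations

for value, count in counts.items():
    share_of_rows = 100 * count / n_rows
    share_of_present = 100 * count / n_present
    print(f"{value}: {share_of_rows:.1f}% of all rows, "
          f"{share_of_present:.1f}% of non-missing values")
```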

Most occurring characters

Value  Count  Frequency (%)
0      1440   53.9%
.      891    33.3%
1      342    12.8%
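
The character counts follow directly from the value counts once each value is treated as a three-character string. A minimal sketch with `collections.Counter`, assuming the column is rendered as the strings "0.0" and "1.0":

```python
from collections import Counter

# Assumption: values are profiled as the strings "0.0" and "1.0";
# counts are taken from the Common Values table.
values = ["0.0"] * 549 + ["1.0"] * 342

char_counts = Counter("".join(values))
total_chars = sum(char_counts.values())

for char, count in char_counts.most_common():
    print(f"{char!r}: {count} ({100 * count / total_chars:.1f}%)")
```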

Most occurring categories, scripts, and blocks

Value      Count  Frequency (%)
(unknown)  2673   100.0%

All characters fall in a single (unknown) category, script, and block, so the per-category, per-script, and per-block character frequencies equal those under Most occurring characters.

Pclass
Categorical

High correlation 

Distinct: 3
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 10.4 KiB

Value  Count
3      709
1      323
2      277

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1309
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 1
3rd row: 3
4th row: 1
5th row: 3

Common Values

Value  Count  Frequency (%)
3      709    54.2%
1      323    24.7%
2      277    21.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
3      709    54.2%
1      323    24.7%
2      277    21.2%

Most occurring characters

Value  Count  Frequency (%)
3      709    54.2%
1      323    24.7%
2      277    21.2%

Most occurring categories, scripts, and blocks

Value      Count  Frequency (%)
(unknown)  1309   100.0%

All characters fall in a single (unknown) category, script, and block, so the per-category, per-script, and per-block character frequencies equal those under Most occurring characters.

Name
Text

Distinct: 1307
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Memory size: 10.4 KiB

Length

Max length: 82
Median length: 56
Mean length: 27.130634
Min length: 12

Characters and Unicode

Total characters: 35514
Distinct characters: 60
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1305
Unique (%): 99.7%

Sample

1st row: Braund, Mr. Owen Harris
2nd row: Cumings, Mrs. John Bradley (Florence Briggs Thayer)
3rd row: Heikkinen, Miss. Laina
4th row: Futrelle, Mrs. Jacques Heath (Lily May Peel)
5th row: Allen, Mr. William Henry

Words

Value                Count  Frequency (%)
mr                   763    14.3%
miss                 260    4.9%
mrs                  201    3.8%
william              87     1.6%
john                 72     1.3%
master               61     1.1%
henry                49     0.9%
charles              39     0.7%
james                38     0.7%
george               37     0.7%
Other values (1940)  3742   70.0%

Most occurring characters

Value              Count  Frequency (%)
(space)            4044   11.4%
r                  2929   8.2%
e                  2525   7.1%
a                  2443   6.9%
i                  1946   5.5%
s                  1925   5.4%
n                  1900   5.4%
M                  1643   4.6%
l                  1593   4.5%
o                  1475   4.2%
Other values (50)  13091  36.9%

Most occurring categories, scripts, and blocks

Value      Count  Frequency (%)
(unknown)  35514  100.0%

All characters fall in a single (unknown) category, script, and block, so the per-category, per-script, and per-block character frequencies equal those under Most occurring characters.

Sex
Categorical

High correlation 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 10.4 KiB

Value   Count
male    843
female  466

Length

Max length: 6
Median length: 4
Mean length: 4.7119939
Min length: 4

Characters and Unicode

Total characters: 6168
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: male
2nd row: female
3rd row: female
4th row: female
5th row: male

Common Values

Value   Count  Frequency (%)
male    843    64.4%
female  466    35.6%
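
The total character count and mean length for Sex can be derived from the two category counts. A brief sketch, with the counts copied from the Common Values table and illustrative variable names:

```python
# Category counts copied from the Sex tables.
counts = {"male": 843, "female": 466}

n_values = sum(counts.values())
total_chars = sum(len(value) * count for value, count in counts.items())
mean_length = total_chars / n_values

print(f"Total characters: {total_chars}")
print(f"Mean length: {mean_length:.7f}")
```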

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
male    843    64.4%
female  466    35.6%

Most occurring characters

Value  Count  Frequency (%)
e      1775   28.8%
m      1309   21.2%
a      1309   21.2%
l      1309   21.2%
f      466    7.6%

Most occurring categories, scripts, and blocks

Value      Count  Frequency (%)
(unknown)  6168   100.0%

All characters fall in a single (unknown) category, script, and block, so the per-category, per-script, and per-block character frequencies equal those under Most occurring characters.

Age
Real number (ℝ)

High correlation  Missing 

Distinct: 98
Distinct (%): 9.4%
Missing: 263
Missing (%): 20.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.881138
Minimum: 0.17
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.4 KiB

Quantile statistics

Minimum: 0.17
5-th percentile: 5
Q1: 21
median: 28
Q3: 39
95-th percentile: 57
Maximum: 80
Range: 79.83
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 14.413493
Coefficient of variation (CV): 0.48236093
Kurtosis: 0.14694764
Mean: 29.881138
Median Absolute Deviation (MAD): 8
Skewness: 0.40767456
Sum: 31255.67
Variance: 207.74879
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
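
Several of the Age figures are related by simple identities, which makes them easy to cross-check. The sketch below assumes the mean is taken over the non-missing values only (1309 − 263 rows) and uses the reported, rounded inputs, so the results match only to the precision shown in the report.

```python
# Figures copied from the Age statistics.
n_rows, n_missing = 1309, 263
total, std = 31255.67, 14.413493
minimum, q1, q3, maximum = 0.17, 21, 39, 80

# Assumption: missing values are excluded from the mean.
n_present = n_rows - n_missing
mean = total / n_present

cv = std / mean                  # coefficient of variation
variance = std ** 2
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum

print(n_present, round(mean, 6), round(cv, 6), round(variance, 3), iqr, round(value_range, 2))
```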
Common values

Value              Count  Frequency (%)
24                 47     3.6%
22                 43     3.3%
21                 41     3.1%
30                 40     3.1%
18                 39     3.0%
25                 34     2.6%
28                 32     2.4%
36                 31     2.4%
26                 30     2.3%
27                 30     2.3%
Other values (88)  679    51.9%
(Missing)          263    20.1%

Minimum 10 values

Value  Count  Frequency (%)
0.17   1      0.1%
0.33   1      0.1%
0.42   1      0.1%
0.67   1      0.1%
0.75   3      0.2%
0.83   3      0.2%
0.92   2      0.2%
1      10     0.8%
2      12     0.9%
3      7      0.5%

Maximum 10 values

Value  Count  Frequency (%)
80     1      0.1%
76     1      0.1%
74     1      0.1%
71     2      0.2%
70.5   1      0.1%
70     2      0.2%
67     1      0.1%
66     1      0.1%
65     3      0.2%
64     5      0.4%

SibSp
Real number (ℝ)

Zeros 

Distinct: 7
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.49885409
Minimum: 0
Maximum: 8
Zeros: 891
Zeros (%): 68.1%
Negative: 0
Negative (%): 0.0%
Memory size: 10.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 2
Maximum: 8
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.0416584
Coefficient of variation (CV): 2.0881023
Kurtosis: 20.043251
Mean: 0.49885409
Median Absolute Deviation (MAD): 0
Skewness: 3.8442203
Sum: 653
Variance: 1.0850522
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Common values

Value  Count  Frequency (%)
0      891    68.1%
1      319    24.4%
2      42     3.2%
4      22     1.7%
3      20     1.5%
8      9      0.7%
5      6      0.5%

Minimum 10 values

Value  Count  Frequency (%)
0      891    68.1%
1      319    24.4%
2      42     3.2%
3      20     1.5%
4      22     1.7%
5      6      0.5%
8      9      0.7%

Maximum 10 values

Value  Count  Frequency (%)
8      9      0.7%
5      6      0.5%
4      22     1.7%
3      20     1.5%
2      42     3.2%
1      319    24.4%
0      891    68.1%
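
Because SibSp has only seven distinct values, its summary statistics can be rebuilt from the frequency table alone. The following sketch assumes the report's variance is the sample variance (divisor n − 1); the dictionary simply restates the counts listed above.

```python
import math

# Value -> count, copied from the SibSp frequency table.
freq = {0: 891, 1: 319, 2: 42, 3: 20, 4: 22, 5: 6, 8: 9}

n = sum(freq.values())
total = sum(value * count for value, count in freq.items())
mean = total / n

# Assumption: sample variance with divisor n - 1.
variance = sum(count * (value - mean) ** 2 for value, count in freq.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean

print(n, total, round(mean, 8), round(variance, 7), round(std, 7), round(cv, 7))
```

With this divisor, the sum, mean, variance, standard deviation, and CV agree with the SibSp descriptive statistics.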

Parch
Real number (ℝ)

Zeros 

Distinct: 8
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.38502674
Minimum: 0
Maximum: 9
Zeros: 1002
Zeros (%): 76.5%
Negative: 0
Negative (%): 0.0%
Memory size: 10.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.86556028
Coefficient of variation (CV): 2.2480524
Kurtosis: 21.541079
Mean: 0.38502674
Median Absolute Deviation (MAD): 0
Skewness: 3.6690782
Sum: 504
Variance: 0.74919459
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Common values

Value  Count  Frequency (%)
0      1002   76.5%
1      170    13.0%
2      113    8.6%
3      8      0.6%
5      6      0.5%
4      6      0.5%
6      2      0.2%
9      2      0.2%

Minimum 10 values

Value  Count  Frequency (%)
0      1002   76.5%
1      170    13.0%
2      113    8.6%
3      8      0.6%
4      6      0.5%
5      6      0.5%
6      2      0.2%
9      2      0.2%

Maximum 10 values

Value  Count  Frequency (%)
9      2      0.2%
6      2      0.2%
5      6      0.5%
4      6      0.5%
3      8      0.6%
2      113    8.6%
1      170    13.0%
0      1002   76.5%

Ticket
Text

Distinct: 929
Distinct (%): 71.0%
Missing: 0
Missing (%): 0.0%
Memory size: 10.4 KiB

Length

Max length: 18
Median length: 17
Mean length: 6.7906799
Min length: 3

Characters and Unicode

Total characters: 8889
Distinct characters: 35
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 713
Unique (%): 54.5%

Sample

1st row: A/5 21171
2nd row: PC 17599
3rd row: STON/O2. 3101282
4th row: 113803
5th row: 373450

Words

Value               Count  Frequency (%)
pc                  92     5.5%
c.a                 46     2.7%
ca                  22     1.3%
a/5                 22     1.3%
2                   17     1.0%
soton/o.q           16     1.0%
sc/paris            16     1.0%
ston/o              14     0.8%
w./c                14     0.8%
2343                11     0.7%
Other values (960)  1403   83.9%

Most occurring characters

Value              Count  Frequency (%)
3                  1110   12.5%
1                  1000   11.2%
2                  862    9.7%
7                  697    7.8%
4                  652    7.3%
6                  628    7.1%
0                  610    6.9%
5                  582    6.5%
9                  465    5.2%
8                  426    4.8%
Other values (25)  1857   20.9%

Most occurring categories, scripts, and blocks

Value      Count  Frequency (%)
(unknown)  8889   100.0%

All characters fall in a single (unknown) category, script, and block, so the per-category, per-script, and per-block character frequencies equal those under Most occurring characters.

Fare
Real number (ℝ)

High correlation  Zeros 

Distinct: 281
Distinct (%): 21.5%
Missing: 1
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.295479
Minimum: 0
Maximum: 512.3292
Zeros: 17
Zeros (%): 1.3%
Negative: 0
Negative (%): 0.0%
Memory size: 10.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 7.225
Q1: 7.8958
median: 14.4542
Q3: 31.275
95-th percentile: 133.65
Maximum: 512.3292
Range: 512.3292
Interquartile range (IQR): 23.3792

Descriptive statistics

Standard deviation: 51.758668
Coefficient of variation (CV): 1.5545254
Kurtosis: 27.027986
Mean: 33.295479
Median Absolute Deviation (MAD): 6.9042
Skewness: 4.3677091
Sum: 43550.487
Variance: 2678.9597
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value               Count  Frequency (%)
8.05                60     4.6%
13                  59     4.5%
7.75                55     4.2%
26                  50     3.8%
7.8958              49     3.7%
10.5                35     2.7%
7.775               26     2.0%
7.2292              24     1.8%
7.925               23     1.8%
26.55               22     1.7%
Other values (271)  905    69.1%

Minimum 10 values

Value   Count  Frequency (%)
0       17     1.3%
3.1708  1      0.1%
4.0125  1      0.1%
5       1      0.1%
6.2375  1      0.1%
6.4375  3      0.2%
6.45    1      0.1%
6.4958  3      0.2%
6.75    2      0.2%
6.8583  1      0.1%

Maximum 10 values

Value     Count  Frequency (%)
512.3292  4      0.3%
263       6      0.5%
262.375   7      0.5%
247.5208  3      0.2%
227.525   5      0.4%
221.7792  4      0.3%
211.5     5      0.4%
211.3375  4      0.3%
164.8667  4      0.3%
153.4625  3      0.2%

Cabin
Text

Missing 

Distinct: 186
Distinct (%): 63.1%
Missing: 1014
Missing (%): 77.5%
Memory size: 10.4 KiB

Length

Max length: 15
Median length: 3
Mean length: 3.7389831
Min length: 1

Characters and Unicode

Total characters: 1103
Distinct characters: 19
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 107
Unique (%): 36.3%

Sample

1st row: C85
2nd row: C123
3rd row: E46
4th row: G6
5th row: C103

Words

Value               Count  Frequency (%)
f                   8      2.2%
c23                 6      1.7%
c27                 6      1.7%
c25                 6      1.7%
b57                 5      1.4%
b59                 5      1.4%
b63                 5      1.4%
b66                 5      1.4%
g6                  5      1.4%
f4                  4      1.1%
Other values (192)  301    84.6%

Most occurring characters

Value             Count  Frequency (%)
C                 114    10.3%
2                 97     8.8%
B                 96     8.7%
1                 94     8.5%
3                 87     7.9%
6                 81     7.3%
5                 79     7.2%
(space)           61     5.5%
4                 58     5.3%
8                 51     4.6%
Other values (9)  285    25.8%

Most occurring categories, scripts, and blocks

Value      Count  Frequency (%)
(unknown)  1103   100.0%

All characters fall in a single (unknown) category, script, and block, so the per-category, per-script, and per-block character frequencies equal those under Most occurring characters.

Embarked
Categorical

High correlation 

Distinct: 3
Distinct (%): 0.2%
Missing: 2
Missing (%): 0.2%
Memory size: 10.4 KiB

Value  Count
S      914
C      270
Q      123

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1307
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: S
2nd row: C
3rd row: S
4th row: S
5th row: S

Common Values

Value      Count  Frequency (%)
S          914    69.8%
C          270    20.6%
Q          123    9.4%
(Missing)  2      0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
s      914    69.9%
c      270    20.7%
q      123    9.4%

Most occurring characters

Value  Count  Frequency (%)
S      914    69.9%
C      270    20.7%
Q      123    9.4%

Most occurring categories, scripts, and blocks

Value      Count  Frequency (%)
(unknown)  1307   100.0%

All characters fall in a single (unknown) category, script, and block, so the per-category, per-script, and per-block character frequencies equal those under Most occurring characters.

WikiId
Real number (ℝ)

High correlation  Uniform 

Distinct: 1304
Distinct (%): 100.0%
Missing: 5
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 658.53451
Minimum: 1
Maximum: 1314
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 66.15
Q1: 326.75
median: 661.5
Q3: 987.25
95-th percentile: 1248.85
Maximum: 1314
Range: 1313
Interquartile range (IQR): 660.5

Descriptive statistics

Standard deviation: 380.37737
Coefficient of variation (CV): 0.57761191
Kurtosis: -1.2052155
Mean: 658.53451
Median Absolute Deviation (MAD): 330.5
Skewness: -0.0074106982
Sum: 858729
Variance: 144686.95
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                Count  Frequency (%)
951                  1      0.1%
842                  1      0.1%
1311                 1      0.1%
328                  1      0.1%
1278                 1      0.1%
54                   1      0.1%
27                   1      0.1%
667                  1      0.1%
903                  1      0.1%
1277                 1      0.1%
Other values (1294)  1294   98.9%
(Missing)            5      0.4%

Minimum 10 values

Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%
10     1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
1314   1      0.1%
1313   1      0.1%
1312   1      0.1%
1311   1      0.1%
1310   1      0.1%
1309   1      0.1%
1308   1      0.1%
1307   1      0.1%
1306   1      0.1%
1305   1      0.1%

Distinct: 1303
Distinct (%): 99.9%
Missing: 5
Missing (%): 0.4%
Memory size: 10.4 KiB

Length

Max length: 69
Median length: 53
Mean length: 27.32362
Min length: 12

Characters and Unicode

Total characters: 35630
Distinct characters: 92
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1302
Unique (%): 99.8%

Sample

1st row: Braund, Mr. Owen Harris
2nd row: Cumings, Mrs. Florence Briggs (née Thayer)
3rd row: Heikkinen, Miss Laina
4th row: Futrelle, Mrs. Lily May (née Peel)
5th row: Allen, Mr. William Henry

Words

Value                Count  Frequency (%)
mr                   756    13.9%
miss                 267    4.9%
mrs                  195    3.6%
née                  179    3.3%
william              69     1.3%
master               61     1.1%
john                 61     1.1%
and                  41     0.8%
henry                41     0.8%
mary                 38     0.7%
Other values (2001)  3727   68.6%

Most occurring characters

Value              Count  Frequency (%)
(space)            4131   11.6%
r                  2834   8.0%
e                  2571   7.2%
a                  2493   7.0%
n                  2102   5.9%
i                  1932   5.4%
s                  1918   5.4%
M                  1644   4.6%
l                  1522   4.3%
o                  1385   3.9%
Other values (82)  13098  36.8%

Most occurring categories, scripts, and blocks

Value      Count  Frequency (%)
(unknown)  35630  100.0%

All characters fall in a single (unknown) category, script, and block, so the per-category, per-script, and per-block character frequencies equal those under Most occurring characters.

Age_wiki
Real number (ℝ)

High correlation 

Distinct: 78
Distinct (%): 6.0%
Missing: 7
Missing (%): 0.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.415829
Minimum: 0.17
Maximum: 74
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.4 KiB

Quantile statistics

Minimum: 0.17
5-th percentile: 6.05
Q1: 21
median: 28
Q3: 37.75
95-th percentile: 55
Maximum: 74
Range: 73.83
Interquartile range (IQR): 16.75

Descriptive statistics

Standard deviation: 13.758954
Coefficient of variation (CV): 0.4677398
Kurtosis: 0.17307643
Mean: 29.415829
Median Absolute Deviation (MAD): 8
Skewness: 0.43161917
Sum: 38299.41
Variance: 189.30882
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value              Count  Frequency (%)
22                 62     4.7%
24                 56     4.3%
21                 51     3.9%
28                 47     3.6%
18                 46     3.5%
19                 45     3.4%
25                 45     3.4%
20                 45     3.4%
30                 43     3.3%
29                 42     3.2%
Other values (68)  820    62.6%

Minimum 10 values

Value  Count  Frequency (%)
0.17   1      0.1%
0.33   1      0.1%
0.42   1      0.1%
0.58   1      0.1%
0.75   2      0.2%
0.83   3      0.2%
0.92   1      0.1%
1      11     0.8%
2      13     1.0%
3      7      0.5%

Maximum 10 values

Value  Count  Frequency (%)
74     1      0.1%
71     3      0.2%
70     1      0.1%
69     1      0.1%
67     1      0.1%
66     2      0.2%
65     2      0.2%
64     5      0.4%
63     6      0.5%
62     6      0.5%

Distinct: 566
Distinct (%): 43.4%
Missing: 5
Missing (%): 0.4%
Memory size: 10.4 KiB

Length

Max length: 49
Median length: 39
Mean length: 23.526074
Min length: 6

Characters and Unicode

Total characters: 30678
Distinct characters: 88
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 348
Unique (%): 26.7%

Sample

1st row: Bridgerule, Devon, England
2nd row: New York, New York, US
3rd row: Jyväskylä, Finland
4th row: Scituate, Massachusetts, US
5th row: Birmingham, West Midlands, England

Words

Value               Count  Frequency (%)
england             318    7.9%
us                  291    7.2%
new                 212    5.2%
york                186    4.6%
ireland             120    3.0%
sweden              106    2.6%
london              98     2.4%
uk                  66     1.6%
lebanon             65     1.6%
finland             56     1.4%
Other values (758)  2525   62.5%

Most occurring characters

Value              Count  Frequency (%)
n                  2809   9.2%
(space)            2739   8.9%
a                  2403   7.8%
,                  2217   7.2%
e                  2160   7.0%
o                  1706   5.6%
r                  1679   5.5%
l                  1424   4.6%
i                  1211   3.9%
d                  1179   3.8%
Other values (78)  11151  36.3%

Most occurring categories, scripts, and blocks

Value      Count  Frequency (%)
(unknown)  30678  100.0%

All characters fall in a single (unknown) category, script, and block, so the per-category, per-script, and per-block character frequencies equal those under Most occurring characters.

Boarded
Categorical

High correlation 

Distinct: 4
Distinct (%): 0.3%
Missing: 5
Missing (%): 0.4%
Memory size: 10.4 KiB

Value        Count
Southampton  916
Cherbourg    259
Queenstown   119
Belfast      10

Length

Max length: 11
Median length: 11
Mean length: 10.480828
Min length: 7

Characters and Unicode

Total characters: 13667
Distinct characters: 20
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Southampton
2nd row: Cherbourg
3rd row: Southampton
4th row: Southampton
5th row: Southampton

Common Values

Value        Count  Frequency (%)
Southampton  916    70.0%
Cherbourg    259    19.8%
Queenstown   119    9.1%
Belfast      10     0.8%
(Missing)    5      0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value        Count  Frequency (%)
southampton  916    70.2%
cherbourg    259    19.9%
queenstown   119    9.1%
belfast      10     0.8%

Most occurring characters

Value              Count  Frequency (%)
o                  2210   16.2%
t                  1961   14.3%
u                  1294   9.5%
h                  1175   8.6%
n                  1154   8.4%
a                  926    6.8%
S                  916    6.7%
m                  916    6.7%
p                  916    6.7%
r                  518    3.8%
Other values (10)  1681   12.3%

Most occurring categories, scripts, and blocks

Value      Count  Frequency (%)
(unknown)  13667  100.0%

All characters fall in a single (unknown) category, script, and block, so the per-category, per-script, and per-block character frequencies equal those under Most occurring characters.

Distinct: 291
Distinct (%): 22.3%
Missing: 5
Missing (%): 0.4%
Memory size: 10.4 KiB

Length

Max length: 39
Median length: 32
Mean length: 21.516104
Min length: 2

Characters and Unicode

Total characters: 28057
Distinct characters: 56
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 145
Unique (%): 11.1%

Sample

1st row: Qu'Appelle Valley, Saskatchewan, Canada
2nd row: New York, New York, US
3rd row: New York City
4th row: Scituate, Massachusetts, US
5th row: New York City

Words

Value               Count  Frequency (%)
us                  926    20.5%
new                 661    14.7%
york                582    12.9%
city                247    5.5%
canada              125    2.8%
illinois            100    2.2%
pennsylvania        99     2.2%
chicago             75     1.7%
michigan            72     1.6%
jersey              60     1.3%
Other values (347)  1562   34.6%

Most occurring characters

Value              Count  Frequency (%)
(space)            3205   11.4%
,                  2094   7.5%
a                  1831   6.5%
o                  1808   6.4%
e                  1783   6.4%
n                  1639   5.8%
i                  1582   5.6%
r                  1331   4.7%
t                  1067   3.8%
S                  1033   3.7%
Other values (46)  10684  38.1%

Most occurring categories, scripts, and blocks

Value      Count  Frequency (%)
(unknown)  28057  100.0%

All characters fall in a single (unknown) category, script, and block, so the per-category, per-script, and per-block character frequencies equal those under Most occurring characters.

Lifeboat
Categorical

High correlation  Missing 

Distinct24
Distinct (%)4.8%
Missing807
Missing (%)61.7%
Memory size10.4 KiB
13
42 
C
41 
15
38 
14
34 
4
 
31
Other values (19)
316 

Length

Max length5
Median length1
Mean length1.4342629
Min length1

Characters and Unicode

Total characters720
Distinct characters17
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
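
As an illustration of the character properties mentioned above, the following sketch counts characters by Unicode general category using Python's standard `unicodedata` module. The helper name `char_category_counts` is hypothetical, the input is just the five Lifeboat sample values shown in this report, and this is not necessarily how the profiling tool derives its own category tables (which list a single "(unknown)" category here).

```python
import unicodedata
from collections import Counter


def char_category_counts(values):
    """Count characters by Unicode general category (for example 'Nd', 'Lu')."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five Lifeboat sample values listed in this report.
sample = ["4", "14?", "D", "15", "?"]
counts = char_category_counts(sample)

# Nd = decimal digit, Lu = uppercase letter, Po = other punctuation.
for category, count in sorted(counts.items()):
    print(category, count)
```

With real category data, digits, letters, and the "?" uncertainty marker would land in separate rows instead of one "(unknown)" bucket.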

Unique

Unique      3
Unique (%)  0.6%

Sample

1st row  4
2nd row  14?
3rd row  D
4th row  15
5th row  ?

Common Values

Value              Count  Frequency (%)
13                    42  3.2%
C                     41  3.1%
15                    38  2.9%
14                    34  2.6%
4                     31  2.4%
5                     29  2.2%
10                    29  2.2%
9                     26  2.0%
11                    26  2.0%
3                     26  2.0%
Other values (14)    180  13.8%
(Missing)            807  61.7%

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
13                    42  8.4%
c                     41  8.2%
15                    39  7.8%
14                    35  7.0%
4                     31  6.2%
5                     29  5.8%
10                    29  5.8%
9                     26  5.2%
11                    26  5.2%
3                     26  5.2%
Other values (12)    178  35.5%
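
The two Lifeboat frequency tables give different percentages for the same count (value 13 appears 42 times, shown as 3.2% and as 8.4%). The sketch below reproduces both figures from numbers in this report, on the assumption that the first table divides by all 1309 observations and the second by the 502 non-missing entries. The `percent` helper is hypothetical and only illustrates the arithmetic.

```python
# Figures copied from the Lifeboat section of this report.
TOTAL_ROWS = 1309  # number of observations
MISSING = 807      # rows with no Lifeboat value
COUNT_13 = 42      # rows whose Lifeboat value is "13"


def percent(part, whole):
    """Return part / whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


non_missing = TOTAL_ROWS - MISSING

# Assumed denominators: all rows for Common Values, non-missing rows otherwise.
share_of_all_rows = percent(COUNT_13, TOTAL_ROWS)
share_of_present = percent(COUNT_13, non_missing)
missing_share = percent(MISSING, TOTAL_ROWS)

print(non_missing, share_of_all_rows, share_of_present, missing_share)
```

When comparing a category's share across variables, it helps to check which denominator a given table uses, since a mostly missing column such as Lifeboat shifts the percentages considerably.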

Most occurring characters

Value             Count  Frequency (%)
1                   243  33.8%
3                    68  9.4%
5                    68  9.4%
4                    67  9.3%
6                    45  6.2%
C                    41  5.7%
2                    32  4.4%
0                    29  4.0%
9                    26  3.6%
8                    24  3.3%
Other values (7)     77  10.7%

Most occurring categories

Value      Count  Frequency (%)
(unknown)    720  100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)    720  100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)    720  100.0%

Body
Text

Missing 

Distinct      130
Distinct (%)  100.0%
Missing       1179
Missing (%)   90.1%
Memory size   10.4 KiB

Length

Max length     14
Median length  5
Mean length    4.7461538
Min length     3

Characters and Unicode

Total characters     617
Distinct characters  18
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      130
Unique (%)  100.0%

Sample

1st row  175MB
2nd row  322M
3rd row  38MB
4th row  234MB
5th row  181MB

Value               Count  Frequency (%)
96mb                    1  0.8%
306mb                   1  0.8%
322m                    1  0.8%
38mb                    1  0.8%
234mb                   1  0.8%
181mb                   1  0.8%
309m                    1  0.8%
140mb                   1  0.8%
240{?}mb                1  0.8%
283mb                   1  0.8%
Other values (120)    120  92.3%

Most occurring characters

Value             Count  Frequency (%)
M                   128  20.7%
B                   120  19.4%
1                    63  10.2%
2                    58  9.4%
3                    37  6.0%
8                    30  4.9%
6                    29  4.7%
9                    29  4.7%
5                    28  4.5%
7                    28  4.5%
Other values (8)     67  10.9%

Most occurring categories

Value      Count  Frequency (%)
(unknown)    617  100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)    617  100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)    617  100.0%

Class
Categorical

High correlation 

Distinct      3
Distinct (%)  0.2%
Missing       5
Missing (%)   0.4%
Memory size   10.4 KiB

Value  Count
3.0      706
1.0      326
2.0      272

Length

Max length     3
Median length  3
Mean length    3
Min length     3

Characters and Unicode

Total characters     3912
Distinct characters  5
Distinct categories  1
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  3.0
2nd row  1.0
3rd row  3.0
4th row  1.0
5th row  3.0

Common Values

Value      Count  Frequency (%)
3.0          706  53.9%
1.0          326  24.9%
2.0          272  20.8%
(Missing)      5  0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
3.0      706  54.1%
1.0      326  25.0%
2.0      272  20.9%

Most occurring characters

Value  Count  Frequency (%)
.       1304  33.3%
0       1304  33.3%
3        706  18.0%
1        326  8.3%
2        272  7.0%
7.0%

Most occurring categories

Value      Count  Frequency (%)
(unknown)   3912  100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)   3912  100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)   3912  100.0%

Interactions

[Figure: 49 pairwise interaction plots]

Correlations

                     Age    Age_wiki     Boarded       Class    Embarked        Fare    Lifeboat       Parch PassengerId      Pclass         Sex       SibSp    Survived      WikiId
Age                1.000       0.980       0.073       0.294       0.063       0.193       0.169      -0.216       0.027       0.297       0.091      -0.130       0.155      -0.350
Age_wiki           0.980       1.000       0.104       0.323       0.119       0.209       0.151      -0.209       0.022       0.325       0.072      -0.145       0.132      -0.353
Boarded            0.073       0.104       1.000       0.290       0.935       0.176       0.455       0.063       0.000       0.288       0.140       0.101       0.180       0.249
Class              0.294       0.323       0.290       1.000       0.279       0.475       0.707       0.036       0.044       0.992       0.118       0.153       0.340       0.899
Embarked           0.063       0.119       0.935       0.279       1.000       0.218       0.450       0.093       0.033       0.278       0.116       0.114       0.166       0.294
Fare               0.193       0.209       0.176       0.475       0.218       1.000       0.356       0.400      -0.004       0.482       0.185       0.446       0.283      -0.637
Lifeboat           0.169       0.151       0.455       0.707       0.450       0.356       1.000       0.105       0.035       0.706       0.418       0.176       0.349       0.308
Parch             -0.216      -0.209       0.063       0.036       0.093       0.400       0.105       1.000      -0.006       0.036       0.236       0.438       0.157      -0.042
PassengerId        0.027       0.022       0.000       0.044       0.033      -0.004       0.035      -0.006       1.000       0.034       0.000      -0.032       0.035      -0.044
Pclass             0.297       0.325       0.288       0.992       0.278       0.482       0.706       0.036       0.034       1.000       0.119       0.153       0.337       0.891
Sex                0.091       0.072       0.140       0.118       0.116       0.185       0.418       0.236       0.000       0.119       1.000       0.187       0.540       0.133
SibSp             -0.130      -0.145       0.101       0.153       0.114       0.446       0.176       0.438      -0.032       0.153       0.187       1.000       0.187      -0.076
Survived           0.155       0.132       0.180       0.340       0.166       0.283       0.349       0.157       0.035       0.337       0.540       0.187       1.000       0.345
WikiId            -0.350      -0.353       0.249       0.899       0.294      -0.637       0.308      -0.042      -0.044       0.891       0.133      -0.076       0.345       1.000
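
The "High correlation" alerts at the top of this report can be related to the matrix above. The sketch below flags pairs whose absolute coefficient reaches a cutoff. The 0.5 cutoff is an assumption that happens to be consistent with the alerts listed in this report, the `high_pairs` helper is hypothetical, and only a handful of coefficients are copied from the matrix.

```python
# A few coefficients copied from the correlation matrix (one entry per pair).
corr = {
    ("Age", "Age_wiki"): 0.980,
    ("Age", "Fare"): 0.193,
    ("Boarded", "Embarked"): 0.935,
    ("Class", "Lifeboat"): 0.707,
    ("Class", "Pclass"): 0.992,
    ("Fare", "WikiId"): -0.637,
    ("Sex", "Survived"): 0.540,
}

THRESHOLD = 0.5  # assumed cutoff, not confirmed by the report itself


def high_pairs(coefficients, threshold):
    """Return the pairs whose absolute correlation is at least `threshold`."""
    return sorted(
        pair for pair, r in coefficients.items() if abs(r) >= threshold
    )


flagged = high_pairs(corr, THRESHOLD)
for left, right in flagged:
    print(f"{left} is highly correlated with {right}")
```

Using the absolute value matters for pairs such as Fare and WikiId, whose coefficient is strongly negative.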

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
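
One common way to quantify nullity correlation is the Pearson correlation between per-column missingness indicators (1 where a value is missing, 0 otherwise). The sketch below applies that idea to the Cabin and Lifeboat values of the first ten sample rows in this report, with `None` standing in for NaN. Whether the heatmap uses exactly this computation is an assumption, and the helper names are hypothetical.

```python
from math import sqrt


def nullity(column):
    """Map a column to 1 where the value is missing and 0 where it is present."""
    return [1 if value is None else 0 for value in column]


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Cabin and Lifeboat for the first ten sample rows (None stands in for NaN).
cabin = [None, "C85", None, "C123", None, None, "E46", None, None, None]
lifeboat = [None, "4", "14?", "D", None, None, None, None, "15", "?"]

r = pearson(nullity(cabin), nullity(lifeboat))
print(f"nullity correlation (Cabin, Lifeboat): {r:.3f}")
```

A value near 1 would mean the two columns tend to be missing together, a value near -1 that one tends to be present when the other is missing. Ten rows are far too few for a meaningful estimate, so this only illustrates the mechanics.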

Sample

(index) | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | WikiId | Name_wiki | Age_wiki | Hometown | Boarded | Destination | Lifeboat | Body | Class
0 | 1 | 0.0 | 3 | Braund, Mr. Owen Harris | male | 22.0 | 1 | 0 | A/5 21171 | 7.2500 | NaN | S | 691.0 | Braund, Mr. Owen Harris | 22.0 | Bridgerule, Devon, England | Southampton | Qu'Appelle Valley, Saskatchewan, Canada | NaN | NaN | 3.0
1 | 2 | 1.0 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38.0 | 1 | 0 | PC 17599 | 71.2833 | C85 | C | 90.0 | Cumings, Mrs. Florence Briggs (née Thayer) | 35.0 | New York, New York, US | Cherbourg | New York, New York, US | 4 | NaN | 1.0
2 | 3 | 1.0 | 3 | Heikkinen, Miss. Laina | female | 26.0 | 0 | 0 | STON/O2. 3101282 | 7.9250 | NaN | S | 865.0 | Heikkinen, Miss Laina | 26.0 | Jyväskylä, Finland | Southampton | New York City | 14? | NaN | 3.0
3 | 4 | 1.0 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.0 | 1 | 0 | 113803 | 53.1000 | C123 | S | 127.0 | Futrelle, Mrs. Lily May (née Peel) | 35.0 | Scituate, Massachusetts, US | Southampton | Scituate, Massachusetts, US | D | NaN | 1.0
4 | 5 | 0.0 | 3 | Allen, Mr. William Henry | male | 35.0 | 0 | 0 | 373450 | 8.0500 | NaN | S | 627.0 | Allen, Mr. William Henry | 35.0 | Birmingham, West Midlands, England | Southampton | New York City | NaN | NaN | 3.0
5 | 6 | 0.0 | 3 | Moran, Mr. James | male | NaN | 0 | 0 | 330877 | 8.4583 | NaN | Q | 785.0 | Doherty, Mr. William John (aka "James Moran") | 22.0 | Cork, Ireland | Queenstown | New York City | NaN | NaN | 3.0
6 | 7 | 0.0 | 1 | McCarthy, Mr. Timothy J | male | 54.0 | 0 | 0 | 17463 | 51.8625 | E46 | S | 200.0 | McCarthy, Mr. Timothy J. | 54.0 | Dorchester, Massachusetts, US | Southampton | Dorchester, Massachusetts, US | NaN | 175MB | 1.0
7 | 8 | 0.0 | 3 | Palsson, Master. Gosta Leonard | male | 2.0 | 3 | 1 | 349909 | 21.0750 | NaN | S | 1108.0 | Pålsson, Master Gösta Leonard | 2.0 | Bjuv, Skåne, Sweden | Southampton | Chicago, Illinois, US | NaN | NaN | 3.0
8 | 9 | 1.0 | 3 | Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) | female | 27.0 | 0 | 2 | 347742 | 11.1333 | NaN | S | 902.0 | Johnson, Mrs. Elisabeth Vilhelmina (née Berg) | 26.0 | St. Charles, Illinois, US | Southampton | St. Charles, Illinois, US | 15 | NaN | 3.0
9 | 10 | 1.0 | 2 | Nasser, Mrs. Nicholas (Adele Achem) | female | 14.0 | 1 | 0 | 237736 | 30.0708 | NaN | C | 520.0 | Nassr Allah, Mrs. Adal (née Akim)[62][77] | 14.0 | Zahlé, Lebanon, Ottoman Empire | Cherbourg | Cleveland, Ohio, US | ? | NaN | 2.0

(index) | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | WikiId | Name_wiki | Age_wiki | Hometown | Boarded | Destination | Lifeboat | Body | Class
1299 | 1300 | NaN | 3 | Riordan, Miss. Johanna Hannah"" | female | NaN | 0 | 0 | 334915 | 7.7208 | NaN | Q | 1154.0 | Riordan, Miss Hannah | 18.0 | Glenlougha, Cork, Ireland | Queenstown | New York City | 13 | NaN | 3.0
1300 | 1301 | NaN | 3 | Peacock, Miss. Treasteall | female | 3.0 | 1 | 1 | SOTON/O.Q. 3101315 | 13.7750 | NaN | S | 1119.0 | Peacock, Miss Treasteall | 4.0 | Southampton, Hampshire, England | Southampton | Elizabeth, New Jersey, US | NaN | NaN | 3.0
1301 | 1302 | NaN | 3 | Naughton, Miss. Hannah | female | NaN | 0 | 0 | 365237 | 7.7500 | NaN | Q | 1064.0 | Naughton, Miss Hannah | 21.0 | Donoughmore, Ireland | Queenstown | New York City | NaN | NaN | 3.0
1302 | 1303 | NaN | 1 | Minahan, Mrs. William Edward (Lillian E Thorpe) | female | 37.0 | 1 | 0 | 19928 | 90.0000 | C78 | Q | 206.0 | Minahan, Mrs. Lillian E. (née Thorpe) | 37.0 | Fond du Lac, Wisconsin, US | Southampton | Fond du Lac, Wisconsin, US | 14 | NaN | 1.0
1303 | 1304 | NaN | 3 | Henriksson, Miss. Jenny Lovisa | female | 28.0 | 0 | 0 | 347086 | 7.7750 | NaN | S | 869.0 | Henriksson, Miss Jenny Lovisa | 28.0 | Stockholm, Sweden | Southampton | Iron Mountain, Michigan, US | NaN | 3MB | 3.0
1304 | 1305 | NaN | 3 | Spector, Mr. Woolf | male | NaN | 0 | 0 | A.5. 3236 | 8.0500 | NaN | S | 1227.0 | Spector, Mr. Woolf | 23.0 | London, England | Southampton | New York City | NaN | NaN | 3.0
1305 | 1306 | NaN | 1 | Oliva y Ocana, Dona. Fermina | female | 39.0 | 0 | 0 | PC 17758 | 108.9000 | C105 | C | 229.0 | and maid, Doña Fermina Oliva y Ocana | 39.0 | Madrid, Spain | Cherbourg | New York, New York, US | 8 | NaN | 1.0
1306 | 1307 | NaN | 3 | Saether, Mr. Simon Sivertsen | male | 38.5 | 0 | 0 | SOTON/O.Q. 3101262 | 7.2500 | NaN | S | 1169.0 | Sæther, Mr. Simon Sivertsen | 43.0 | Skaun, Sør-Trøndelag, Norway | Southampton | US | NaN | 32MB | 3.0
1307 | 1308 | NaN | 3 | Ware, Mr. Frederick | male | NaN | 0 | 0 | 359309 | 8.0500 | NaN | S | 1289.0 | Ware, Mr. Frederick William | 34.0 | Greenwich, London, England | Southampton | New York City | NaN | NaN | 3.0
1308 | 1309 | NaN | 3 | Peter, Master. Michael J | male | NaN | 1 | 1 | 2668 | 22.3583 | NaN | C | 702.0 | Butrus-Youssef, Master Makhkhul | 4.0 | Sar'al[81], Syria | Cherbourg | Detroit, Michigan, US | D | NaN | 3.0